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Abstract — This paper introduces an expectation-maximization 
(EM) algorithm within a wavelet domain Bayesian framework for 
semi-blind channel estimation of multiband OFDM based UWB 
communications. A prior distribution is chosen for the wavelet 
coefficients of the unknown channel impulse response in order to 
model a sparseness property of the wavelet representation. This 
prior yields, in maximum a posteriori estimation, a thresholding 
rule within the EM algorithm. We particularly focus on reducing 
the number of estimated parameters by iteratively discarding 
"unsignificant" wavelet coefficients from the estimation process. 
Simulation results using UWB channels issued from both models 
and measurements show that under sparsity conditions, the 
proposed algorithm outperforms pilot based channel estimation 
in terms of mean square error and bit error rate and enhances 
the estimation accuracy with less computational complexity than 
traditional semi-blind methods. 



I. Introduction 

A UWB radio signal is defined as any signal whose band- 
width is larger than 20% of its center frequency or greater 
than 500 MHz [1]. In recent years, UWB system design has 
experienced a shift from the traditional "single-band" radio 
that occupies the whole 7.5 GHz allocated spectrum to a 
"multiband" design approach [2]. That consists in dividing 
the available UWB spectrum into several subbands, each one 
occupying approximately 500 MHz. 

Multiband Orthogonal Frequency Division Multiplexing 
(MB -OFDM) [3] is a strong candidate for multiband UWB 
which enables high data rate UWB transmission to inherit 
all the strength of OFDM that has already been shown for 
wireless communications (ADSL, DVB, 802.11a, 802. 16. a, 
etc.). This approach uses a conventional coded OFDM system 
[4] together with bit interleaved coded modulation (BICM) 
and frequency hopping over different subbands to improve 
diversity and to enable multiple access. 

Basic receivers proposed for MB-OFDM [3], estimate the 
channel by using pilots (known training symbols) transmitted 
at the beginning of the information frame, implicitly assuming 
a time invariant channel within a single frame. Thus, for 
an accurate channel acquisition, one must send several pilot 
patterns resulting in a significant loss in spectral efficiency. 

Recent works [5], [6] have reported promising results on the 
combination of channel estimation and data decoding process 
by using the Expectation-Maximization (EM) algorithm [7] . 
Though the latter scheme outperforms pilot based receivers, 
it has a higher complexity that may be of a critical concern 



for its practical implementations. This complexity is mainly 
dominated by the number of estimated parameters for channel 
updating and the decoding algorithm within each iteration. 

In this work, we consider a semi-blind joint channel esti- 
mation and data detection scheme based on the EM algorithm, 
with the objective of minimizing the number of estimated 
parameters and enhancing the estimation accuracy. This is 
achieved by expressing the unknown channel impulse response 
(CIR) in terms of its discrete wavelet series, which has been 
shown to provide a parsimonious representation [8], [9]. Thus, 
we choose a particular prior distribution for the channel 
wavelet coefficients that renders the maximum a posteriori 
(MAP) channel estimation equivalent to a hard thresholding 
rule at each iteration of the EM algorithm. The latter is 
then exploited to reduce the estimator computational load 
by discarding "unsignificant" wavelet coefficients from the 
estimation process. Moreover, since the probability of encoded 
bits are involved in the EM computation, we naturally combine 
the iterative process of channel estimation with the decoding 
operation of encoded data. 

This paper is organized as follows. Section introduces 
MB-OFDM and its wavelet domain channel estimation obser- 
vation model. In sectionUm we first describe a MAP version of 
the EM algorithm for channel estimation and then show how 
the number of estimated parameters can be reduced through 
the EM iterations. The combination of the channel estimation 
part with the decoding operation and implementation issues 
are also discussed. Section [V] illustrates, via simulations, the 
performance of the proposed receiver over a realistic UWB 
channel environment and section |VT] concludes the paper. 

Notational conventions are as follows: 2? x is a diagonal 
matrix with diagonal elements x = [x\, . . . , Xn] t , E x [.] refers 
to expectation with respect to x, I/v denotes an (N x N) 
identity matrix; ||.||, (.)*, (.) T and (.) H denote Frobenious 
norm, matrix or vector conjugation, transpose and Hermitian 
transpose, respectively. 

II. System model and wavelet domain problem 

FORMULATION 

MB-OFDM system divides the spectrum between 3.1 to 
10.6 GHz into several non-overlapping subbands each one 
occupying 528 MHz of bandwidth [3]. The transmitter archi- 
tecture for the MB-OFDM system is very similar to that of a 
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conventional wireless OFDM system. The main difference is 
that MB-OFDM system uses a time-frequency code (TFC) to 
select the center frequency of different subbands which is used 
not only to provide frequency diversity but also to distinguish 
between multiple users (see figures[T]and0. Here, we consider 
MB-OFDM in its basic mode ie. employing the three first 
subbands. 
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Fig. 1 . Example of time-frequency coding for the multiband OFDM system: 
TFC={1, 3, 2, 1, 3, 2, ...}. 

We consider the multiband OFDM transmission of figure [2] 
using N data subcarriers. At the receiver, assuming a cyclic 
prefix (CP) longer than the channel maximum delay spread and 
perfect synchronization, OFDM converts a frequency selective 
channel into N parallel fiat fading subchannels [4] for each 
subband as 

Yi,n = "D Si „ hi t n + Zj.n i E {1,2,3}, n = 1, . . .,N B ym 

(1) 

where (1 x N) vectors yi >n , s,- in and hj iTl denote received 
and transmitted symbols, and the channel frequency response 
respectively; the noise block z irt is assumed to be a zero mean 
white complex Gaussian noise with distribution CJV(0, u 2 1n) 
; i is the subband index and n refers to the OFDM symbol 
index inside the frame. The observation model corresponding 
to all three subbands can be written in frequency domain as 



where Y m = [yi,„,y 2 . 



1, 



(2) 



n , y3/ 

T 



S m = [Sl,n, S2,n, S3, n ] , 

H m = [hi ! „,h 2 ,„,h 3jn ] T and Z m = [zi, n , z 2 ,„, z 3 ,„] T are 
(M x 1) vectors, with M = 3N and M sym = N sym /3. In the 
remainder, unless otherwise mentioned, we will not write the 
time index m for notational convenience. 

In order to take advantage of the wavelet based estimation, 
the channel impulse response is expressed in terms of its 
orthogonal discrete wavelet coefficients. Let Fm.l be the 
truncated fast Fourier transform (FFT) matrix constructed from 
the (M x M) FFT matrix by keeping the first L columns where 
L is the length of the CIR over a group of three subbands. We 
define W as the (L x L) orthogonal discrete wavelet transform 
(ODWT) matrix. The unknown channel can be expressed as 
H = FM,iW H g, where g is the (L x 1) vector of the CIR 
wavelet coefficients. The Observation model |2] is rewritten as 

Y = 2? s Tg + Z (3) 

where T = F m ,lW w . 

Although at the transmitter, the channel is practically used 
by slices of 528 MHz bandwidth that corresponds to one of 
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Fig. 2. TX architecture of the multiband OFDM system. 



the subbands, at the receiver side we gather three received 
OFDM symbols for estimating the wavelet coefficients of the 
CIR, taken over all of the subbands (1.584 GHz bandwidth). 
This is motivated by the fact that estimating the channel over 
a wider bandwidth leads to a sparser wavelet representation. 
Besides, this approach simplifies the receiver architecture since 
there is no need to change the central frequency for down 
converting different subbands. 

III. The EM-MAP algorithm for wavelet domain 

CHANNEL ESTIMATION 

The EM algorithm proposed in this section is able to inte- 
grate the advantages of wavelet based estimation via the prior 
choosen for channel wavelet coefficients. Next, we see how 
the MAP estimator leads to a thresholding procedure which is 
used for reducing the number of estimated coefficients at each 
iteration of the EM algorithm. 

A. An equivalent model and the EM principle 

Our first step consists in decomposing the AWGN in 0) 
into the sum of two different Gaussian noise terms as 



£> Zi 



(4) 



where Zi and Z 2 are (M x 1) independent Gaussian noise 
vectors such that p(Zi) = CN{Q,a 2 I M ) and p(Z 2 ) = 
CAf(0, & 2 Im — c* 2 P s D^). Since we are using normalized 
QPSK symbols, T> S V^ = I^j and the covariance matrix of Z 2 
reduces to S 2 = (a 2 — a 2 )Ij\/. We define the positive design 
parameter p = a 2 /a 2 , (0 < p < 1) and notice that setting 
p = 1 leads to Z 2 = which is equivalent to working with 
the initial model (0). However, for < p < 1, the above noise 
decomposition allows the introduction of a hidden channel 
vector H defined as 



H =Tg + Z! 
Y = V s H + Z 



(5) 



The hidden vector H provides a direct relation between true 
and estimated wavelet coefficients corrupted by an additive 
white Gaussian noise, allowing the two-stage observation 
model (0 which is equivalent to (0. However, the difference 
with a standard denoising problem is that S and H are 
unknown. Hence, the observation model has missing datas and 
hidden variables and the MAP solution of g has no closed 
form. In such situations, the EM algorithm [7] is often used 
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to maximize the expectation of the posterior distribution over 
all possible missing and hidden variables. 

Let X = {Y, S,H} be the complete data set in the 
EM algorithm terminology. Note that the observation set Y 
determines only a subset of the space 3£ of which X is 
an outcome. We search g that maximizes logp(g|X). After 
initialization by a short pilot sequence at the beginning of the 
frame, the EM algorithm alternates between the following two 
steps (until some stopping criterion) to produce a sequence of 
estimates {g (t) , t = 0, 1, . . . , t max }. 

• Expectation Step (E-step): The conditional expectation 
of the complete log-likelihood given the observed vector 
and the current estimate g^ is calculated. This quantity 
is called the auxiliary or Q-function 



Q(g,g 



(t)\ _ 



J S,H 



logp(Y,S,H|g) 



y,g 



(6) 



• Maximization Step (M-step): The estimated parameter 
is updated according to 

g (t+i) = argmax{Q(g,g( t )) +log^(g)} (7) 

where 7r(g) is a prior distribution for the wavelet coefficients. 
Next, we derive the specific formulas of each step, according 
to ©. 



B. E-step: Computation of the Q-function 
The complete likelihood is 

p(Y, S, H|g) = p(Y\S, H, g) p(S|H, g) p(H|g). 

According to (0, conditioned on H, Y is independent of g. 
Furthermore, S which results from coding and interleaving 
of bit sequence is independent of H and g. Since Zi is a 
complex white Gaussian noise, the complete log-likelihood 
can be simplified to 

logp(Y,S,H|g) = log[p(Y|S,H)p(S)p(H|g)] 
= logp(H|g) + est. 

= 5 + CSt. 

or 

(8) 

where est. are different constant terms that do not depend on 
g. According to © we have 



Q(g>g W )=E s .H 



CSt. 



(HO) - Tg | 



est. 



(9) 



where (HW) = E g|g [H|Y, g«]. 

From (O, it is obvious that the E-step involves only the 
computation of (H^), we have 

(H<*>) = E f / fi p(H|Y,g«)dH)p(S|Y,g«) 

(10) 



where the last equation results from the independence between 
S and H belonging respectively to the sets and ,34? which 
contain all of their possible values. 

In order to evaluate (H"), we first have to evaluate the 
conditional mean u~ of H as 

H 

= / H p(H|Y,g«)dH (11) 

Since both p(Y|H) and p(Hjgw) are Gaussian densities, 
p(H|Y,gW) oc p(Y|H) p(H|g (t) ) is also Gaussian. By 
standard manipulation of Gaussian densities, we obtain 



= Tg 



(*) 



pP«(Y-P s Tg 



(*) 



(12) 



By using (fT2b in ( TTOb and after some simplifications we get 



(HW> = (l-p)Tg( t )+pPV 



(13) 



where V s = E„ e * P a p(S|Y, gW). 

The E-step is then completed by inserting (H^*^) into 
Q(g,g^), equation @. 



C. M-step: Wavelet Based MAP Estimation 

In this step the estimate of the parameter g is updated as 
given in Q where Q(6>,6» (t) ) is given by (0 

,2 \ 



g(*+i) — argmax ■ 



(H(*)) - Tg | 



+ log7r(gU. (14) 



Due to the orthonormality of both Fourier and wavelet trans- 
forms, T W T = I L and we can replace || (H^*)) - Tg || by 
|| gW -g|| , where 

~( t ) =T M(g(t)) 

= (l-p)gW+p(P s T) H Y 
The M-step can be written as 



(15) 



r(*+l) 



r(t) 



arg max ■ 
g 



+ log7r(g) 



(16) 



P 



Actually g(* +1 ' in (fT4l is no more than the MAP estimate 
of g from the observation model 

= g + z; (i7) 

where Z[ = T H Z X - CAf(0, a 2 I L ). From the Bayes theorem, 
the posterior distribution of g is given by 

g|gW) oc p(g ( * ) |g) 7 r(g) (18) 

where p(g ( -*- ) |g) is the Gaussian likelihood, g ~ CjV(g, q; 2 Il). 
In this approach, we adopt the Bernoulli-Gaussian prior dis- 
tribution 7r (g) for the wavelet coefficients g of the unknown 
CIR described by 

tt ( 9j ) = A S( 9j ) + (1 - A) CM m (0, r 2 ) (19) 

for j — 1,...,L, which allows us to model a sparseness 
property of UWB channels in wavelet domain. This amounts 
considering that the wavelet coefficients have a probability A to 
be zero and a probability 1 — A to be distributed as CA/*(0, r 2 ). 
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In order to deal with that particular model, we introduce an 
additional state variable (or indicator) [3j E {0, 1} such that 
we can express this prior conditionally as 



Csil& = o) 



with probability A, 



1) 



CAfg. (0, t 2 ) with probability 1 - A. 

(20) 

This prior model, conditionally on that state variable, leads 
to a Gaussian posterior for gj which makes the estimation 
explicit; from the direct observation model In = gj + Z[ j, 
we can express these posterior probabilities of (3j as 



p[pj = Q\$ ) ) = AW(0,a 2 )/c 



pl(3 j = l\gi t A = (1-A) A/" (0, a 2 +r 2 )/c 



(21) 



where the constant c = AW (0, a 2 ) + (1 - A) Af (0, a 2 + r 2 ). 
From this set of equations, we easily notice that the indica- 
tor variable f3j allows us to discriminate between the noise 
coefficients (for = 0) and the effective channel wavelet 
coefficients (for (3j — 1), eventually corrupted by noise. The 
indicator variables /3j are estimated, in the MAP sense, by 



P. 



(t+i) _ 



0, if p fa = Q\gf) > 0.5 



(22) 



1, elsewhere. 



Therefore, the MAP estimates of the channel wavelet coef- 
ficients are obtained by a simple denoising/thresholding rule 

as 



,(*+!) _ 



0, 



if pf +1) = 



9j 



(t+i) 



if 13. 



(t+i) 



(23) 



1) t and A updating: The prior parameters r and A stand 
respectively for the (significant)-wavelet coefficients energy 
and unsignificant coefficient probability. The update rules for 
these two parameters are MAP based rules derived from 
assigning conjugate priors to these parameters [10]: 



A = (L-l/2)/(L-l), 
r 2 = V /(L-L) 

where L = Card{j | /3j — 0} and r\ = Yla-=\ 
Card{.} denoting the set cardinality. 



(24) 



,(*+!)! 



2) Reduction of the number of estimated parameters: 
The thresholding procedure derived in this section, provides 
an easy framework for reducing the number of estimated 
coefficients. This can be done by discarding at each iteration, 
the elements of g( t+1 ) that are replaced by zero in (l23l . The 
underlying assumption is as follows: whenever the estimator 
attributes an unknown wavelet coefficient to noise (replace it 
by zero), this coefficient will always be considered as noise 
and so will not be estimated in future iterations. 

This operation is shown in figure [3] and can be modeled as: 



a(T) 



(25) 



where the truncation operator 0(.) gathers in g^* +1 ' > the 
components of g(* +1 ) that must be kept and the operator S(.) 
constructs T tr from T by keeping the rows corresponding to 
kept indexes. During the first iteration (t = 0), the algorithm 
does not perform any truncation and the EM algorithm esti- 
mates all coefficients. However, after each M-step, the number 
of unknown parameters to be estimated in the next iteration is 
reduced according ( f25l l by using and T tr in the update 

formula of the E-step (fT3l . 

IV. DECODING METHOD AND IMPLEMENTATION ISSUES 

According to equation ([Tol l, we make use of the information 
on transmitted symbols, obtained from the decoder, to update 
the channel estimate at each iteration. Besides, the decoder 
requires an estimate of the channel in order to provide the 
probability of encoded bits. Hence, the semi-blind channel 
estimation algorithm is naturally combined with the process 
of data decoding. The a posteriori probability of the unknown 
symbol Sk, p(Sk\Yki Hu ), is calculated using the a posteriori 
probabilities provided by the decoder at the end of the i-th 
iteration as 



p(S k \Yk,H) = Y[P dec (c k ,i) 



(26) 



;.=i 



where Pdoc(cfc,i) is the a posteriori probability corresponding 
to the i-th bit of Sk, Ck,i- At the first iteration, where no a 
priori information is available on bits cu,i, Pdcc(ck,i) are set 
to 1/2. 
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Fig. 3. EM-MAP channel estimation combined with the decoding process. 

Among several possible ways to practically implement a 
joint channel estimation and decoding receiver, we adopt the 
following global procedure (see figure [3J. 

• Initialization (t = 0) 

- Set all probabilities of coded bits Pdec(c k ,i) to 1/2 
and derive p(S|Y, g' -*) according to (|26| >. 

- Initialize the unknown vector g by g(°) obtained 
from pilot symbols. 

• for t = 1, . . . , i max 

- Use the current estimate to calculate g(* +1 ) 
according to (l23l . 

- Discard the wavelet coefficients that are replaced by 
zero for the next iteration by evaluating g^* and T tr 
from ((25). 

- if t ^ t max : Use the current estimate g^ to update 
the probability of encoded bits Pdcc(ck.i) and derive 
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p(S|Y, gW) from (f26j. and EM-Wav method have respectively about 0.2 dB and 0.5 

else: Decode the information data by thresholding dB of SNR degradation from the performance obtained with 
the uncoded bit probabilities with 1/2. perfect CSI. 



V. Simulation results 

In this section we present a comparative performance study 
of the proposed EM-MAP algorithm. The binary information 
data are encoded by a non-recursive non-systematic convo- 
lutional encoder with rate R — 1/2 and constraint length 3. 
Each frame has a payload of 1 KB along with 3 pilot symbols 
at the beginning for initializing the channel of each subband. 
The interleaver is random and operates over the entire frame. 
Among different wavelet families, "symmetric" wavelet basis 
functions [11] providing the sparser representation [9] have 
been considered. Unless otherwise mentioned, the curves are 
obtained after t max = 4 iterations. 

First, a sparse channel model where only 20 wavelet coef- 
ficients out of total 96 have non zero values, is considered. 
The second channel, referred to as Corridor, is a line of sight 
(LOS) scenario issued from realistic UWB indoor channel 
measurements [12] where the receive and transmit antennas 
are located in a corridor separated by 9 meters. 

Performance comparison is made with two pilot-only based 
approach using ML and minimum mean square error (MMSE) 
channel estimation, referred to as pilot-ML and pilot-MMSE. 
We also compare the proposed algorithm with two semi-blind 
channel estimation based on the EM algorithm, called respec- 
tively EM-Freq and EM-Wav. The first approach, consists of 
estimating the channel over all of the three subbands, using 
the model (O, similar to [5] while the second scheme is a 
wavelet domain EM based estimation of the channel where 
the prior model is set to have a uniform distribution. 

Figure [4] depicts the mean square error (MSE) between 
true and estimated channel as a function of E^/Nq. It can be 
noticed that, although the pilot-MMSE approach improves the 
estimation accuracy for low SNR values, the performance of 
pilot based channel estimation methods are very far from the 
family of semi-blind methods. Comparing the wavelet domain 
semi-blind approach (EM-Wav) and the frequency domain 
approach (EM-freq), shows that significant gain is achieved 
by the former method. As shown, the best performance is 
achieved by the EM-MAP method. We see that by using 
EM-MAP, a gain of almost 4 dB in SNR is achieved at 
MSE=2 x 10~ 3 , as compared to the EM-Wav method. This 
clearly shows the adequacy of the EM-MAP method for the 
case where the unknown channel has few non zero wavelet 
coefficients, which is in perfect agreement with the prior 
model. 

Figure [5] shows the BER results along with the BER for 
the case of perfect channel state information (CSI). It can 
be seen that at a BER of 10~ 3 , the pilot-ML and the EM- 
Freq approaches are respectively 3.9 and 2 dB of SNR far 
from the BER obtained with the perfect channel. Furthermore, 
the performance of the Pilot-MMSE approach is not shown 
since it was very close to that of Pilot-ML. Also, we observe 
that wavelet based semi-blind methods perform closely to the 
perfect CSI case. For example, at BER=10~ 4 , the EM-MAP 
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Fig. 4. Mean square error between the true and estimated coefficients for 
the sparse channel model. 
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Fig. 5. BER performance of the EM-MAP method over the sparse channel 
model. 



We now evaluate the performance of EM-MAP by con- 
sidering the Corridor channel. Figure [6j shows that wavelet 
based methods again outperforms pilot based and EM-Freq 
methods in terms of MSE and BER. However, the EM-MAP 
performance is now comparable to that of EM-Wav method. 
This can be explained by noting that when the channel is 
not sparse, small values are attributed to A by the algorithm 
(see d24l). This leads to a gaussian prior model with a 
large variance compared to the noise variance, which can be 
approximated with a uniform prior. As a results, the prior 
becomes less informative and the EM-MAP performs close 
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to EM-Wav, as shown in figures [6] Thus, the proposed EM- 
MAP algorithm is able to adapt its prior model parameters for 
each propagation environment. 
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Fig. 6. Mean square error between the true and estimated coefficients over 
the Corridor channel. 



Figure [7] shows the average number of estimated parameters 
versus the iteration number different channel scenarios. As 
observed, the EM-MAP approach tends to reduce significantly 
the number of estimated parameters. This can be seen for the 
sparse channel where the number of estimated parameters is 
reduced up to 20 parameters at the fifth iteration. Furthermore, 
under non-sparse Corridor channel, the figure shows that 
EM-MAP method is preferred to EM-Wav, due to its lower 
computational load. 
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VI. Conclusion 

This paper proposed a semi-blind MAP channel estimation 
algorithm that integrates the advantages of wavelet based 
estimation. The investigated method naturally combines the 
EM iterations with the decoding process. We derived an equiv- 
alent data model for the multiband OFDM system involving 
the channel over all 3 subbands expressed in the wavelet 
domain. By choosing a Bernoulli-Gaussian prior distribution 
for the channel wavelet coefficients, the MAP estimator yields 
a thresholding procedure at the M-step of the EM algorithm 
which we used to reduce the number of estimated coefficients. 
With only few iterations, the EM-MAP method provides 
significant reduction in the number of estimated parameters 
and outperforms all considered pilot based and semi-blind 
methods. 
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Fig. 7. Reduction of the number of estimated parameters through iterations, 
Eb/NO = 8 dB. 



